A NOTE ON ESTIMATING CONTINUOUS
TIME DECISION MODELS


by 

* 

R. P. Trost, Philip Lurie and Edward Berger


I. INTRODUCTION

Continuous time decision models have a long history in economics. In
fact, the early work of Mincer [1962] implies that a woman's labor force
participation decision is made with a continuous time horizon in mind.
More recently, Mincer and Polachek [1974 and 1978], Mincer and Ofek [1980],
and Sandell and Shapiro [1978] analyze the length of time women remain out of
the labor force. Sandell [1977] examines the determinants of the number of
years women work after the birth of their first child, and in a stock adjustment
model of fertility decisions, Hyman [1980] examines the length of time
between births. There are also several studies that seem suited to continuous
time analysis but are estimated with discrete time choice models. For example,
Shapiro and Mott [1979] estimate a discrete time choice model of women's
post child labor force participation rates. One could also analyze post
child labor force participation, as Mincer and Polachek [1974] do, by looking
at a woman's interval of nonlabor force participation following the birth of a
child.

All of the above studies choose (or imply) as the dependent variable of
interest some measure of continuous time, whether it be in percentage terms
or actual years. In some of these papers (e.g., Mincer and Polachek [1974]
and Hyman [1980]) this dependent variable is studied by looking at its mean
value (actual or predicted) in various age and education subgroups. In
other studies (e.g., Mincer and Ofek [1980] and Shapiro and Mott [1979])
the determinants of the dependent variable are estimated with regression
analysis. While the estimation of continuous time models is a step in the
right direction (away from discrete time choice models), the purpose of our
note is to point out that the empirical results in these otherwise excellent
theoretical papers may be biased. The source of this bias lies in the way
censored observations are handled in the empirical analysis. The rest of
the paper is organized as follows. In Section Two we discuss
a potential source of bias in previous studies. In Section Three we propose
an alternative method for estimating time decision models, and in Section
Four we consider two different applications of this method. Section Five
contains the conclusions.

II.  A  SOURCE  OF  BIAS:  CENSORED  OBSERVATIONS 

All the studies cited in the introduction derive and estimate life
cycle decision models. One variable of interest in these papers is some sort
of "length of time" variable. For example, Mincer and Polachek [1974]
study, among other things, the length of time women remain out of the labor
force following the birth of the first child. Mincer and Ofek [1980] estimate
the determinants of a woman's duration of unemployment. Hyman [1980]
estimates a stock adjustment fertility model where one variable of interest
is the length of time between children¹ (i.e., childspacing). The authors'
justification for studying these dependent variables is well founded in
economic theory, and we have no quarrel with their theoretical models. Our
concern is with the empirical section and in particular with the manner in
which censored observations are handled in the analysis. A simple example
will demonstrate our point.

¹ Although Hyman's theoretical interests include the decision regarding the
desired spacing between children, his actual empirical analysis only looks
at the proportion of respondents who had one child during the following
year, given that they desired to have one more child. However, child spacing
in the usual meaning of the word is implied in his paper and is worth
investigating.

Suppose we want to estimate the average length of time women wait
before returning to work following the birth of their first child. For
simplicity, assume we have a sample of ten women who all gave birth on the
same day. Let the observations y_i be:

1, 1, 2, 2, 3, 3, 4, 4, 5, 5

where y_i is the number of years elapsed before the women return to work.
If we analyze these data five years after the children are born, we can simply
average the ten observations to get an unbiased estimate of the mean (3.0).
Suppose, as is often the case with panel data, we do not have "completed"
values of y_i for all the observations. This would be the case, for example,
if we have data for only the first three years following the birth of these
children. In this case we will have six observations where the length of time
is observed (1, 1, 2, 2, 3, 3) and four where the length of time is censored
at 3 years (the values 4, 4, 5, 5, which of course are unobserved). Here the
term censored means that all we know is that these four women wait at least
three years before returning to work. If we simply average the six uncensored
observations, we will underestimate the mean of y_i (2.0 rather than 3.0).
Even if we include the censored observations and enter a value of "3" for
them, we will still underestimate the mean value of y_i (2.4 rather than 3.0).
These same problems exist if we wish to use regression analysis to measure
the impact of an exogenous variable X_i on y_i, i.e., we will underestimate
E(y_i | X_i). Surprisingly, despite these obvious biases, none of the
previously cited papers discusses (or takes account of) this "censored data"
problem. As we show in the next section, there is a simple method for handling
this problem that, to our knowledge, has never been applied to economic data.
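The arithmetic of the ten-woman example can be checked with a short sketch. It assumes the four longest durations are 4, 4, 5, 5 and that data end three years after the births; the variable names are illustrative only.

```python
# Ten completed durations (years until return to work) from the example.
completed = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
cutoff = 3  # assumed: data cover only the first three years after the birth

# Mean when every duration is fully observed.
true_mean = sum(completed) / len(completed)

# Naive treatment 1: drop the censored women and average the rest.
uncensored = [y for y in completed if y <= cutoff]
naive_drop = sum(uncensored) / len(uncensored)

# Naive treatment 2: keep the censored women but record them at the cutoff.
recoded = [min(y, cutoff) for y in completed]
naive_recode = sum(recoded) / len(recoded)

print(true_mean, naive_drop, naive_recode)
```

Both naive averages fall short of the mean of the completed durations, which is the bias described above.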

III. A METHOD FOR ESTIMATING CONTINUOUS TIME MODELS WHEN THE DATA ARE CENSORED

The estimation of continuous time models has a long history in the
biostatistical literature. Such models have been used extensively in the
biomedical sciences in the general area of patient survival. Here the problem
has been to estimate the probability that a patient survives beyond time t.
A plot of these probabilities as a function of time, i.e.,

    S(t) = P(T > t),        (1)

is called a "survival curve."

Kaplan and Meier [1958] were the first to derive a nonparametric maximum
likelihood estimate of the true survival function in the presence of
censoring. However, the Kaplan-Meier method only estimates "unadjusted"
survival curves. That is, the survival curves are not adjusted for exogenous
characteristics. It was not until 1972 that a nonparametric method became
available for handling censored observations while adjusting for factors
(i.e., exogenous variables) which may affect the probability of survival.
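A minimal sketch of the unadjusted product-limit (Kaplan-Meier) estimate may help fix ideas. It is applied to the ten-woman example of Section II, assuming all four censored women are censored at year 3 and that events at a given time precede censorings at that time; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct event time.

    times    -- recorded duration for each subject
    observed -- True if the event occurred at that time, False if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, ev in zip(times, observed) if ev})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Subjects whose event occurs exactly at time t.
        events = sum(1 for ti, ev in zip(times, observed) if ev and ti == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Six observed returns to work and four durations censored at three years.
times = [1, 1, 2, 2, 3, 3, 3, 3, 3, 3]
observed = [True] * 6 + [False] * 4
curve = kaplan_meier(times, observed)
for t, s in curve:
    print(t, round(s, 3))
```

In this example the estimated probabilities of not having returned to work after one, two and three years coincide with the proportions in the complete data (8, 6 and 4 women out of 10), even though four durations are censored.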

The method that adjusts for exogenous variables was first proposed by Cox
[1972] and is known as the Cox Regression model.² The Cox model expresses a
hazard function as

    h_Z(t) = h_0(t) e^{\beta' Z}

where Z is a vector of exogenous variables, β is a vector of unknown
coefficients and h_0(t) is assumed fixed and independent of Z, but otherwise
completely unspecified. Note that h_0(t) corresponds to the hazard function
for the situation when Z = 0. The survival function S_0(t) refers to the
case where the exogenous variables Z = 0 and is expressed as

    S_0(t) = \exp\left( -\int_0^t h_0(x)\,dx \right)

Since the model assumes proportional hazards, i.e., the hazard ratio for
any two values of Z is independent of time, then

    S_1(t) = [S_0(t)]^{\exp(\beta' Z_1)}

where S_1(t) is the survival curve for an individual with exogenous variables
Z_1. Cox [1972] shows how to estimate the vector β and the function h_0(t)
with a maximum likelihood approach.

² The hazard function h(t) is defined as the conditional probability of a
failure in the interval (t, t+dt), given survival to time t. That is,
h(t)dt = P(t < T < t+dt | T > t).
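The step from the proportional hazard specification to the survival curve for an individual with characteristics Z_1 can be written out as follows. This is a sketch of the standard argument, using only the symbols defined above.

```latex
\begin{aligned}
S_1(t) &= \exp\left( -\int_0^t h_{Z_1}(x)\,dx \right)
        = \exp\left( -e^{\beta' Z_1} \int_0^t h_0(x)\,dx \right) \\
       &= \left[ \exp\left( -\int_0^t h_0(x)\,dx \right) \right]^{\exp(\beta' Z_1)}
        = \left[ S_0(t) \right]^{\exp(\beta' Z_1)} .
\end{aligned}
```

The factor e^{\beta' Z_1} can be taken outside the integral because it does not depend on time, which is exactly the proportional hazards assumption.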

In the next section we demonstrate the usefulness of the Cox Regression
technique by estimating an unemployment duration equation and a childspacing
equation.


IV. TWO EMPIRICAL EXAMPLES WITH CENSORED OBSERVATIONS

In this section we demonstrate the feasibility of the Cox model with
two empirical examples. The first application estimates an equation where the
dependent variable is a woman's duration of unemployment following the birth of
her first child. The second application concerns the length of time a family
waits before having their first child.

A. Duration of Unemployment


As we noted earlier, many have studied the labor force participation
decisions of women. Some, like Nelson [1977] and Heckman and Willis [1977],
study it in the context of discrete time, while others, like Mincer and
Ofek [1980], analyze it in continuous time. In the present paper we show
how the Cox regression model can be used to estimate the probability that a
woman returns to work (following the birth of her first child) after one
year, two years, three years, etc. This method takes account of the censored
observations in the sample and will yield an unbiased estimate of the
mean duration. We use the 1973 wave of the Parnes NLS data on young women
to estimate the unemployment duration equation. Table 1 gives mean values
for the six exogenous variables and the mean value of unemployment duration
(the dependent variable). Note that by ignoring the fact that 79 of the 212
duration observations are censored, one would underestimate the mean duration
of unemployment for these women by 16 percent. A consistent estimate of mean
duration is 3.25 and is easily calculated from

    E(T) = \int_0^\infty S(t)\,dt.
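The area-under-the-curve calculation and the 16 percent figure can be illustrated with a small sketch. It assumes annual data, so that the survival curve is a step function with unit-width steps; the five-point curve used here is illustrative (it corresponds to ten durations 1, 1, 2, 2, 3, 3, 4, 4, 5, 5), not the estimated curve from the NLS sample, and `mean_duration` is a hypothetical helper.

```python
def mean_duration(survival_by_year):
    """Area under a step survival curve with unit-width steps.

    survival_by_year[k] is S(k) = P(T > k) for k = 0, 1, 2, ...;
    the curve is assumed to be zero after the last entry.
    """
    return sum(survival_by_year)


# Illustrative curve: S(0), ..., S(4) for durations 1,1,2,2,3,3,4,4,5,5.
toy_mean = mean_duration([1.0, 0.8, 0.6, 0.4, 0.2])

# Understatement reported in the text: simple mean 2.72 versus 3.25.
shortfall = (3.25 - 2.72) / 3.25

print(round(toy_mean, 2), round(shortfall, 3))
```

For the illustrative curve the area equals the ordinary mean of the ten completed durations, and the reported means of 2.72 and 3.25 imply a shortfall of roughly 16 percent.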

Table 2 gives the maximum likelihood estimates for the Cox model. To
see how these coefficients are interpreted, recall that the Cox model expresses
a "survival" curve as:

    S(t) = [S_0(t)]^{\exp(\beta' Z)}        (2)

Consequently, a positive sign on the coefficient for exogenous variable Z
means that the larger Z is, the sooner the women will decide to go back to
work. A negative sign means that the larger Z is, the longer the women will
wait before returning to work.
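The sign rule can be verified by differentiating equation (2) with respect to one element Z_j of Z, treating Z_j as continuous for the purpose of the sketch:

```latex
\frac{\partial S(t)}{\partial Z_j}
  = \beta_j \, e^{\beta' Z} \, \ln S_0(t) \, S(t),
\qquad
0 < S_0(t) < 1 \;\Longrightarrow\; \ln S_0(t) < 0 .
```

Since e^{\beta' Z} and S(t) are positive and \ln S_0(t) is negative, the derivative has the opposite sign to \beta_j: a positive coefficient lowers the probability of still being out of work at time t, and a negative coefficient raises it.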
TABLE 1

DEFINITION AND MEAN OF VARIABLES

Variable              Mean        Definition

Dependent variable    3.25*       Defined in Table 2
EDUC                  11.95       Wife's education
HUSINC                $9274.55    Husband's income
DSMSA                 .39         Dummy = 1 if in SMSA
DRACE                 .245        Dummy = 1 if nonwhite
UNEMP                 7.7         Unemployment rate of county
AGE                   19.54       Age of woman when child was born

*This mean takes account of the censored values. It is calculated
as the area under the survival curve. If we calculate a simple mean of
the 212 observations we get a value of 2.72. Hence, by ignoring the censoring
problem we underestimate the duration of unemployment by 16 percent.




TABLE 2

COEFFICIENT ESTIMATES IN THE COX REGRESSION MODEL

(Dependent Variable = Length of time to re-entering the labor force
after having first child)

Variable    Coefficient    Standard deviation    χ²*

EDUC        .1354          .0674                 4.036
HUSINC      -.0000265      .0000206              1.654
DSMSA       .1326          .1878                 .499
DRACE       .0586          .2092                 .078
UNEMP       -.0127         .0155                 .667
AGE         -.02195        .0465                 .223

Log likelihood (all Betas = 0) = -650.
Log likelihood (at MLE) = -647.09
Number of observations = 212
Number of censored values = 79

*Using a significance level of .05, any χ² value greater than 3.84
is considered significant.
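The χ² column of Table 2 appears to be consistent with the squared ratio of each coefficient to its standard deviation (a Wald-type statistic with one degree of freedom). That reading is an inference from the reported numbers, not something stated in the text; the sketch below simply recomputes the ratio and compares it with the table.

```python
# (coefficient, standard deviation, reported chi-square) from Table 2.
table2 = {
    "EDUC":   (0.1354,     0.0674,    4.036),
    "HUSINC": (-0.0000265, 0.0000206, 1.654),
    "DSMSA":  (0.1326,     0.1878,    0.499),
    "DRACE":  (0.0586,     0.2092,    0.078),
    "UNEMP":  (-0.0127,    0.0155,    0.667),
    "AGE":    (-0.02195,   0.0465,    0.223),
}

CRITICAL_05 = 3.84  # chi-square cutoff quoted in the table footnote

# Assumed form of the statistic: (coefficient / standard deviation) squared.
wald = {name: (coef / sd) ** 2 for name, (coef, sd, _) in table2.items()}
significant = [name for name, stat in wald.items() if stat > CRITICAL_05]

for name, (_, _, reported) in table2.items():
    print(f"{name:7s} recomputed={wald[name]:.3f} reported={reported:.3f}")
print("significant at .05:", significant)
```

The recomputed values agree with the reported ones up to the rounding of the published coefficients, and only EDUC exceeds the 3.84 cutoff.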
For the coefficients in Table 2, then, we see that as the woman's
education increases, she will go back to work sooner. As her husband's
income, her age or the unemployment rate increases, however, she will wait
longer before returning to work. The only significant coefficient at the
.05 level is the education variable.

To see what effect education has on the time at which a woman returns
to the labor force, recall that "survival" in our model means a woman did not
go back to work by time t. Column 2 of Table 3 gives the survival probabilities
for exogenous variables Z_1: (1) EDUC = 12, (2) HUSINC = $9200, (3) DSMSA = 0,
(4) DRACE = 0, (5) UNEMP = 7.7, and (6) AGE = 19.5. Column 3 of Table 3
gives the survival probabilities for exogenous variables Z_2, where Z_2 is
the same as Z_1 except EDUC = 16 rather than 12. Table 3 tells us that the
probability of a woman with characteristics Z_1 not going back to work
within 2 years is .63. For a woman with characteristics Z_2, this probability
is .45.
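Because the two sets of characteristics differ only in EDUC, equation (2) implies that the second survival column equals the first raised to the power exp(4 × .1354). The sketch below checks this against the published columns of Table 3; agreement is only approximate because the table entries are rounded to two decimals.

```python
import math

beta_educ = 0.1354                        # EDUC coefficient from Table 2
ratio = math.exp(beta_educ * (16 - 12))   # hazard ratio for EDUC = 16 vs 12

# Survival probabilities at years 0 through 7, as reported in Table 3.
surv_z1 = [1.00, 0.75, 0.63, 0.51, 0.44, 0.38, 0.34, 0.29]
surv_z2_reported = [1.00, 0.61, 0.45, 0.32, 0.24, 0.19, 0.16, 0.12]

# Proportional hazards: the second curve is the first raised to the ratio.
surv_z2 = [s ** ratio for s in surv_z1]

for year, (implied, reported) in enumerate(zip(surv_z2, surv_z2_reported)):
    print(year, round(implied, 3), reported)
```

Four more years of education multiply the hazard of returning to work by about 1.7, and the implied probabilities match the reported column to within .01.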

TABLE 3

SURVIVAL* PROBABILITIES EVALUATED AT Z_1 AND Z_2

Time    Survival at Z_1    Survival at Z_2

0       1.00               1.00
1       .75                .61
2       .63                .45
3       .51                .32
4       .44                .24
5       .38                .19
6       .34                .16
7       .29                .12

Z_1: (1) EDUC = 12, (2) HUSINC = $9200, (3) DSMSA = 0, (4) DRACE = 0,
     (5) UNEMP = 7.7, and (6) AGE = 19.5.
Z_2: Same as Z_1, except EDUC = 16.

*Surviving to time t is defined as not returning to work by (i.e., up
to and including) time t.


B. Childspacing Equation

A second example that demonstrates the usefulness of the Cox technique
in the estimation of economic decision equations is found in the work of
Hyman [1980]. One variable of interest in Hyman's paper is childspacing,
where he defines "childspacing" as the proportion of respondents who had one
child during the following year, given that they desired to have one more child.
In our paper we define childspacing as the length of time between children and
estimate an equation where the dependent variable is the length of time women
wait before having their first child. We again use the 1973 wave of the
Parnes NLS data on young women. The dependent variable will be censored for
those couples who did not have their first child by 1973. To handle this
censoring problem, the Cox regression model is a natural choice of estimation
technique and is the one we use. Table 4 gives the maximum likelihood
estimates.

For the coefficients in Table 4 we see that as the wife's age, education,
or IQ increases, or as the husband's income goes up, the couple will wait longer
before having the first child. Also, Table 4 tells us that couples who live
in rural regions will wait longer than similar couples who live in urban
areas before having the first child. However, the only significant
coefficients are the education and income coefficients.

For this application "survival" means the family did not have a baby
by any given time t. Column 2 of Table 5 gives the "survival" probabilities
for exogenous variables Z_1: (1) Age = 20, (2) Dummy Rural = 0, (3) IQ = 100,
(4) Educ. = 12, and (5) Income = $5,000. Column 2 tells us that the probability
of not having a child within two years is .425. A similar interpretation holds
for the rest of column 2 in Table 5.

Column 3 of Table 5 gives the "survival" probabilities for exogenous
variables Z_2, where Z_2 is the same as Z_1 except income = $10,000 rather
than $5,000. Notice that the probabilities are uniformly (because of
proportional hazards) higher for Z_2. This means that as income goes up (from
$5,000 to $10,000), couples wait longer before having the first child. The
probability of not having a child by any given year is greater for couples with
husband's income of $10,000 than it is for similar couples with husband's
income of $5,000. A similar interpretation holds for Column 4 of Table 5.

In  this  section  we  gave  two  examples  of  how  the  Cox  model  can  be  used 
to  estimate*  economic  time  decision  models.  Our  purpose  in  this  empirical 
section  was  not  to  re-do  previous  studies,  but  to  demonstrate  the  applicability 
and  feasibility  of  the  Cox  regression  technique.  It  is  also  hoped  that  we  make 
others  aware  of  the  censored  data  problem  in  future  analyses. 




Table 4

Coefficient Estimates in the Cox Regression Model

(Dependent Variable = Number of Years Before Having First Child)

Variable                   Coefficient    Standard Deviation    χ²**

Age at marriage            -.000218       .0267                 0.00
Dummy (=1 if Rural)        -.1072         .0873                 1.51
IQ of Wife                 -.0014         .0031                 .21
Education of Wife          -.08899        .0308                 8.37
Husband's Annual Income    -.000064       .000015               17.73

Log Likelihood (all Betas = 0) = -3692.19
Log Likelihood (at MLE) = -3671.59
Number of observations = 681
Number of couples having a child = 620
Number of censored values = 61

*Since we only had annual data, our dependent variables took on discrete
values 1, 2, 3, etc. Of course, the Cox model can easily handle a continuous
dependent variable.

**All χ² are with 1 degree of freedom. Using a significance level of .05,
any χ² value greater than 3.84 is considered significant.


Table 5

Survival Probabilities Evaluated at Z_1, Z_2 and Z_3

Time    Survival* at Z_1    Survival* at Z_2    Survival* at Z_3

0       1                   1                   1
1       .589                .681                .667
2       .425                .537                .519
3       .276                .393                .373
4       .194                .304                .285
5       .156                .260                .241
6       .119                .213                .196
7       .080                .160                .144
8       .073                .150                .135
9       .064                .136                .122
10      .044                .104                .091
11      .044                .104                .091

Z_1: (1) Age = 20, (2) Dummy Rural = 0, (3) IQ = 100, (4) Education = 12,
     (5) Husband's Annual Income = $5,000.
Z_2: Same as Z_1, except husband's annual income = $10,000.
Z_3: Same as Z_1, except Education = 16.

*"Surviving" to time t is defined as not having a baby by (i.e., up to and
including) time t.
V. CONCLUSIONS


In this paper we pointed out a potential source of bias in the
estimation of continuous time decision equations. This bias will exist
whenever there are censored observations in the data and estimation techniques
such as least squares are used. To correct for this bias one has to use an
estimation technique, such as the Cox regression model, which takes censored
observations into account. We demonstrated the usefulness of the Cox
model by estimating an unemployment duration equation and a childspacing
equation. We think that the Cox model performs adequately and yields
reasonable estimates.